Exploiting Aggregate Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters

Authors

  • Charles Schafer
  • David Yarowsky
Abstract

We propose a novel method for inducing monolingual semantic hierarchies and sense clusters from numerous foreign-language-to-English bilingual dictionaries. The method exploits patterns of non-transitivity in translations across multiple languages. No complex or hierarchical structure is assumed or used in the input dictionaries: each is initially parsed into the “lowest common denominator” form, which is to say, a list of pairs of the form (foreign word, English word). We then propose a monolingual synonymy measure derived from this aggregate resource, which is used to derive multilingually motivated sense hierarchies for monolingual English words, with potential applications in word sense classification, lexicography and statistical machine translation.
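The idea of pooling (foreign word, English word) pairs and looking for non-transitive translation patterns can be illustrated with a minimal Python sketch. This is not the authors' measure: the abstract does not define it, so a plain Jaccard overlap of shared foreign translations stands in here. The function names `build_index` and `synonymy`, the added language tag on each pair, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(pairs):
    """Map each English word to the set of (language, foreign word) entries
    that translate to it. The language tag is an assumption added so that
    identical spellings in different languages are kept apart."""
    index = defaultdict(set)
    for lang, foreign, english in pairs:
        index[english].add((lang, foreign))
    return index

def synonymy(index, a, b):
    """Stand-in similarity (Jaccard overlap), NOT the paper's measure:
    the share of foreign entries translating to both English words."""
    fa, fb = index.get(a, set()), index.get(b, set())
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0

# Invented toy data in the "lowest common denominator" pair form.
pairs = [
    ("fr", "équitable", "fair"), ("fr", "équitable", "just"),
    ("es", "justo", "fair"),     ("es", "justo", "just"),
    ("fr", "foire", "fair"),     ("fr", "foire", "carnival"),
    ("es", "feria", "fair"),     ("es", "feria", "carnival"),
]
index = build_index(pairs)

fair_just = synonymy(index, "fair", "just")
fair_carnival = synonymy(index, "fair", "carnival")
just_carnival = synonymy(index, "just", "carnival")
```

In this toy data "fair" overlaps with both "just" and "carnival", while "just" and "carnival" share no translations. That kind of non-transitivity is what would suggest two separate senses of "fair".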


Similar resources

Combining Machine Readable Lexical Resources and Bilingual Corpora for Broad Word Sense Disambiguation

This paper describes a new approach to word sense disambiguation (WSD) based on automatically acquired "word sense division". The semantically related sense entries in a bilingual dictionary are arranged in clusters using a heuristic labeling algorithm to provide a more complete and appropriate sense division for WSD. Multiple translations of senses serve as outside information for automatic tag...


Word Sense Disambiguation for All Words without Hard Labor

While the most accurate word sense disambiguation systems are built using supervised learning from sense-tagged data, scaling them up to all words of a language has proved elusive, since preparing a sense-tagged corpus for all words of a language is time-consuming and human labor intensive. In this paper, we propose and implement a completely automatic approach to scale up word sense disambigua...


Word etymology in monolingual and bilingual dictionaries: lexicographers' versus EFL learners' perspectives

This paper deals with the treatment of word etymology in monolingual and bilingual dictionaries. It also investigates EFL learners' attitudes towards the importance of etymology for understanding the meaning of the words they look up in dictionaries. The data were collected through tasks of looking up Arabic loan words in English in monolingual and bilingual dictionaries. The results indicate t...


A Quantitative Analysis of the English Lexicon in Wiktionaries and WordNet

This paper presents a quantitative analysis of the English lexicon. Three electronic dictionaries are examined: the English Wiktionary, WordNet, and the Russian Wiktionary. The number of English words and meanings (senses) in these dictionaries was calculated. The distribution of words for each part of speech, the quantity of monosemous and polysemous words and the distribut...


Development of Myanmar-English Bilingual WordNet like Lexicon

A bilingual concept lexicon is of significance for Information Extraction (IE), Machine Translation (MT), Word Sense Disambiguation (WSD) and the like. Myanmar-English Bilingual WordNet like Lexicon (MEBWL) is developed to fulfill the requirements of Language Acquisition (LA). However, building such a lexicon is quite challenging and costly in both time and money. To overcom...




Publication date: 2004